ABSTRACT

We have used paired-end sequencing of yeast nu- 
cleosomal DNA to obtain accurate genomic maps of 
nucleosome positions and occupancies in control 
cells and cells treated with 3-aminotriazole (3AT), 
an inducer of the transcriptional activator Gcn4. In 
control cells, 3AT-inducible genes exhibit a series of 
distinct nucleosome occupancy peaks. However, 
the underlying position data reveal that each nu- 
cleosome peak actually consists of a cluster of 
mutually exclusive overlapping positions, usually 
including a dominant position. Thus, each nucleo- 
some occupies one of several possible positions 
and consequently, different cells have distinct local 
chromatin structures. Induction results in a major 
disruption of nucleosome positioning, sometimes 
with altered spacing and a dramatic loss of occu- 
pancy over the entire gene, often extending into a 
neighbouring gene. Nucleosome-depleted regions 
are generally unaffected. Genes repressed by 3AT 
show the same changes, but in reverse. We 
propose that yeast genes exist in one of several al- 
ternative nucleosomal arrays, which are disrupted 
by activation. We conclude that activation results 
in gene-wide chromatin remodelling and that this 
remodelling can even extend into the chromatin of 
flanking genes. 

INTRODUCTION 

The DNA of eukaryotic cells is organized into chromatin 
to facilitate packaging into the nucleus and to regulate 
access to genetic information. The basic structural unit
of chromatin is the nucleosome, which includes the
nucleosome core, the linker DNA between nucleosomes
and histone H1 (1). The nucleosome core is composed of
an octamer of the four core histones (H2A, H2B, H3 and
H4), around which is wrapped ~147 bp of DNA in 1.75
negative superhelical turns. The nucleosome core can be 
isolated as a metastable intermediate, the 'core particle', 
by digesting chromatin with micrococcal nuclease 
(MNase). Indeed, a low-resolution crystal structure of 
native core particles has been described (2). High- 
resolution structures were obtained later using core par- 
ticles reconstituted with defined DNA (3,4). 

Digestion of chromatin by MNase proceeds through 
several stages. Initially, MNase cuts the relatively unpro- 
tected linker DNA, resulting in a series of discrete DNA 
fragments corresponding to integral numbers of nucleo- 
somes (appearing as a nucleosome 'ladder' in an agarose 
gel). Thus, nucleosomes are regularly spaced along the 
DNA in vivo. In most cell types, the average length of
DNA associated with a nucleosome is ~190 bp (5), but
in budding yeast it is only ~165 bp (6,7). Later in digestion,
MNase begins to trim the ends of the linker DNA,
eventually reaching a transient block in the form of the
chromatosome, a particle containing ~165 bp of DNA
and H1 (8). Finally, MNase removes the remaining
~20 bp of linker DNA to yield the core particle. The
core particle is relatively stable, but MNase destroys it 
eventually. 

In vitro, the histone octamer binds more strongly to 
some DNA sequences than to others; the strongest of 
these are referred to as positioning sequences [reviewed 
in (9)]. In vivo, the distribution of nucleosomes on DNA 
is also strongly dependent on the underlying sequence 
(10-14), suggesting that eukaryotic DNA possesses a 
nucleosome positioning code (12,13). However, the 
requirement for nucleosome spacing, the presence or 
otherwise of sequence-specific transcription factors 
(14,15) and the activities of ATP-dependent nucleosome 
mobilizing complexes (11,16) are expected to modify
how the information specified by the nucleosome code is 
used (17). 

The biological role of nucleosome positioning is a 
subject of intense interest and some controversy 
[reviewed in (18)]. The position of a nucleosome is 
defined by the DNA sequence it occupies, i.e. the DNA
within the core particle. Therefore, all nucleosomes have
a position with respect to the genomic sequence. A useful
concept is to imagine the genome as a series of
overlapping ~147-bp windows, each of which has the
potential to be occupied by a nucleosome (19). The occu- 
pancy of each potential position might be very low (very 
few cells in a population have a nucleosome at this 
position, e.g. a nucleosome-depleted region), or very 
high (maximum occupancy is when all cells in a popula- 
tion have a nucleosome at this position). Thus, a strong 
position is one with a high occupancy (i.e. the same 
nucleosome is present in most cells). 

The advent of DNA microarrays and massively parallel 
sequencing has revolutionized the study of positioning. 
In the first of these pioneering studies, hybridization of 
nucleosomal DNA to microarrays was used to measure 
average occupancies over an entire yeast chromosome 
(20) and later for the entire genome (21). However, this 
method cannot determine nucleosome positions very 
accurately because the precise borders of the nucleosomes 
cannot be ascertained by hybridization. Sequencing of 
nucleosomal DNA can determine positions very accurate- 
ly (22,23) and is now possible on a genome-wide scale. 
Recent genome-wide single-end nucleosome sequencing 
studies have resulted in important insights into nucleo- 
some positioning (13,14,24-27). There is, however, a sig- 
nificant drawback to this approach: only one end of each 
nucleosome is determined. This end sequence might be 
derived from a fully trimmed nucleosome (a core 
particle), thereby providing an accurate position, or it 
might be derived from an incompletely trimmed nucleo- 
some (containing residual linker DNA), or from an 
over-digested nucleosome (cut internally), resulting in an 
inaccurate position. This problem is resolved by 
paired-end sequencing, a refinement of next-generation 
sequencing, which provides sequence reads from both 
ends of the same DNA molecule. Accordingly, after align- 
ment with the genome sequence, the exact length of the 
DNA fragment can be deduced. Paired-end sequencing 
has been used recently to investigate the genomic distribu- 
tions of several classes of MNase-resistant particle derived 
from chromatin (28). 
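
As a minimal illustration of why a read pair yields the exact fragment length, the sketch below assumes that each aligned pair has already been reduced to the leftmost base of the forward read and the rightmost base of the reverse read (1-based, inclusive). The class and function names are hypothetical and are not taken from the analysis scripts.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    """One nucleosomal DNA fragment reconstructed from an aligned read pair."""
    chrom: str
    start: int  # leftmost aligned base (1-based, inclusive)
    end: int    # rightmost aligned base (1-based, inclusive)

    @property
    def length(self) -> int:
        # Both ends are known, so the length follows directly.
        return self.end - self.start + 1

    @property
    def midpoint(self) -> float:
        # Centre of the fragment, taken here as the nucleosome dyad.
        return (self.start + self.end) / 2


def fragment_from_pair(chrom: str, fwd_start: int, rev_end: int) -> Fragment:
    """Build a fragment from the outer coordinates of a read pair."""
    if rev_end < fwd_start:
        raise ValueError("reverse read must end downstream of the forward read start")
    return Fragment(chrom, fwd_start, rev_end)


pair = fragment_from_pair("chrI", 1001, 1147)
print(pair.length, pair.midpoint)
```

A single-end read would supply only `start` (or only `end`), so neither the length nor the midpoint could be computed without assuming a fragment size.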

Accurate positions are crucial for an understanding of 
the sequence determinants of nucleosome positioning. 
A fully trimmed nucleosome core particle should contain 
~147bp, as in the crystal structures (3,4,29). Here, we 
describe the results of a paired-end nucleosome 
sequencing study aimed at defining accurate positions 
for nucleosomes in the budding yeast, Saccharomyces 
cerevisiae. We compare nucleosome positions in control 
cells with those in 3-aminotriazole (3AT)-treated cells. 
3AT inhibits the enzyme encoded by HIS3, which is 
required for histidine biosynthesis, resulting in induction
of the amino acid starvation pathway through translational
control of the transcriptional activator Gcn4 (30).

We show that mono-nucleosome preparations are 
composed of a mixture of particles containing DNA of 
different lengths, as expected. We determine accurate 
nucleosome positions by considering the subset of nucleo- 
some sequences derived from core particles (145-150 bp in 
length). Our results are best described using the concept of 
'nucleosome position clusters', which specify sets of 
mutually exclusive overlapping positions and usually 
include a dominant position (11,16,17,31). Thus, each nu- 
cleosome can adopt one of the several alternative pos- 
itions. To account for position clusters, we propose that 
yeast genes exist in one of several alternative, overlapping, 
nucleosomal arrays. 

Activation by 3AT results in a dramatic loss of canonical
nucleosomes from some genes and the position cluster
organization of the remaining nucleosomes is disrupted. 
Furthermore, chromatin disruption often extends into 
neighbouring genes. Thus, activation-induced chromatin 
remodelling events are gene wide and can even spread 
farther, disturbing the chromatin of flanking genes. 

MATERIALS AND METHODS 

Preparation of core particle DNA 

YDC111 (MATa ade2-1 can1-100 leu2-3,112 trp1-1 ura3-1
RAD5+) (16) was grown to late log phase in synthetic
complete (SC) medium (control) or SC medium lacking
histidine to which 3AT was added to 10 mM for 20 min
just before harvesting (3AT treated). Core particle DNA
was prepared by MNase digestion of nuclei, gel purified, 
repaired and checked for DNA size and quality as 
described (Figure 1) (32). Paired-end sequencing was per- 
formed as described (33). Control cells yielded 16.6 and 
12.3 million aligned paired reads of 40 nt each for the first 
and second experiments, respectively; 3AT-treated cells 
gave 13.1 and 13.5 million aligned paired reads, respect- 
ively. Paired reads were aligned to the S. cerevisiae 
genome using ELAND. Reads with mis-matches were 
excluded from the analysis. The GEO accession number 
for the data presented here is GSE26493. 

Nucleosome positioning 

An algorithm was written to extract nucleosome position- 
ing information from the sequence data. First, it was 
assumed that all nucleosome sequences between 145 and 
150 bp represent accurate positions. These sequences were 
used to define a set of accurate positions (S_AC) adopted by
nucleosomes, represented by their midpoint coordinates
(i.e. nucleosome dyad axis). Secondly, to include as
much of the data as possible, the midpoints of all remaining
reads (those <145 bp or >150 bp) were calculated
and then these reads were allocated to the nucleosome in
S_AC with the closest midpoint, provided that the two midpoints
were <10 bp apart. Finally, the data were smoothed
using a 6-bp window. Nucleosome position maps were 
obtained in which the number of sequences corresponding 
to a specific nucleosome position is plotted against the 
chromosomal coordinate of the dyad axis. The scripts
written to analyse the data are given in Supplementary
Data.

Figure 1. Length distribution of nucleosomal DNA. Shown are data
for 3AT-treated cells. (A) Digestion of yeast nuclei with MNase (30, 60,
120 and 240 Worthington units). DNA was analysed in an agarose gel
stained with ethidium bromide. M, pBR322 digested with MspI.
Mono-nucleosomal DNA in the samples obtained using 60, 120 and
240 U of MNase was gel purified and repaired. An aliquot was
end-labelled using T4 polynucleotide kinase and analysed in (B) a
native polyacrylamide gel and (C) a denaturing polyacrylamide gel;
M: 50-bp ladder (NEB). The sample obtained using 120 U MNase
was ligated to paired-end adaptors and amplified by PCR (D). The
purified product was used for sequencing (M: pBR322 MspI).
(E) Nucleosome sequence length distribution. All sequences are
included, except those derived from the yeast 2-µm plasmid. The
fraction of sequences of a given length is expressed as a percentage
of the total. The numbers indicate peak values. The scale is single
nucleotide resolution.
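
The positioning procedure described under 'Nucleosome positioning' can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the Supplementary scripts: fragments are (start, end) pairs on a single chromosome, midpoints are rounded down to a whole base, and the '6-bp window' is read as a simple moving average. The function names are hypothetical.

```python
from collections import Counter


def position_map(fragments, core_range=(145, 150), max_dist=10):
    """Count fragments per accurate dyad position on one chromosome.

    fragments: iterable of (start, end) pairs, 1-based inclusive.
    Returns a Counter mapping dyad coordinate -> number of fragments.
    """
    lo, hi = core_range
    counts = Counter()
    other_midpoints = []
    for start, end in fragments:
        length = end - start + 1
        midpoint = (start + end) // 2  # assumption: round down to a whole base
        if lo <= length <= hi:
            counts[midpoint] += 1      # core-particle fragment defines S_AC
        else:
            other_midpoints.append(midpoint)

    accurate = sorted(counts)          # S_AC is fixed before any allocation
    for midpoint in other_midpoints:
        if not accurate:
            break
        nearest = min(accurate, key=lambda m: abs(m - midpoint))
        if abs(nearest - midpoint) < max_dist:
            counts[nearest] += 1       # allocate to the closest accurate dyad
        # fragments 10 bp or more from every accurate dyad are left out
    return counts


def smooth(counts, start, end, window=6):
    """Moving average over `window` consecutive coordinates (assumed form)."""
    values = [counts.get(p, 0) for p in range(start, end + 1)]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


example = [(1, 147), (1, 147), (3, 160), (500, 700)]
print(dict(position_map(example)))
```

In the example, the 158-bp fragment is allocated to the accurate dyad 7 bp away, whereas the 201-bp fragment has no accurate position within 10 bp and does not contribute.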



RESULTS 

Paired-end sequencing of yeast nucleosomes reveals 
complex length distributions 

Nucleosome core particles were prepared by MNase 
digestion of nuclei prepared from control and 3AT-treated 
cells (Figure 1). We obtained between 12 and 17 million 
aligned paired reads per sample. The yeast genome can accommodate
approximately 75 000 nucleosomes, given that
the haploid genome is ~12.1 Mb and the nucleosome
spacing is ~165 bp. Consequently, approximately 200 sequences
per nucleosome should be expected. This read
depth should provide data with low statistical sampling 
error. 
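
The coverage estimate above is simple arithmetic and can be checked with a short sketch. The genome size, spacing and read counts are the values quoted in the text; the function names are hypothetical.

```python
GENOME_BP = 12.1e6   # haploid yeast genome size quoted in the text
SPACING_BP = 165     # average nucleosome spacing quoted in the text


def expected_nucleosomes(genome_bp=GENOME_BP, spacing_bp=SPACING_BP):
    """Number of nucleosomes the genome can accommodate at this spacing."""
    return genome_bp / spacing_bp


def reads_per_nucleosome(n_reads, genome_bp=GENOME_BP, spacing_bp=SPACING_BP):
    """Average number of sequenced fragments expected per nucleosome."""
    return n_reads / expected_nucleosomes(genome_bp, spacing_bp)


print(round(expected_nucleosomes()))
for n_reads in (12e6, 17e6):
    print(round(reads_per_nucleosome(n_reads)))
```

With 12 to 17 million aligned pairs, the estimate falls on either side of 200 sequences per nucleosome, in line with the approximate figure given above.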

To maximize the fraction of fully trimmed core particles,
a balance must be struck between the full
trimming required to obtain accurate position data and
the tendency for MNase to begin cutting within the core
particle. A typical nucleosome length distribution
(Figure 1E) suggested the presence of three different populations:
(i) core particles (peaking at 149 bp); these are
mono-nucleosomes with little or no linker DNA remaining.
Consequently, their DNA content defines accurate
positions; (ii) mono-nucleosomes with residual linker
DNA (peaking at ~157 bp and at ~165 bp). Incomplete
trimming might reflect the binding of H1, which is present
at relatively low levels in yeast and so only some nucleosomes
would be expected to contain it (34), although
poorly trimmed nucleosomes of about chromatosome
size have been observed even in the absence of H1 (5);
and (iii) subnucleosomal particles containing less than
~140 bp. These probably derive primarily from internal
cleavage of core particles by MNase, perhaps following 
spontaneous uncoiling of DNA from the ends of the nu- 
cleosome (35). Alternatively, some might represent 
remodelled nucleosomes or transcribed nucleosomes 
lacking an H2A-H2B dimer (36,37). The data presented 
below belong to the first of two independent experiments, 
which gave essentially the same results. 

Position clusters on the PHO5 promoter

The PHO5 promoter was chosen as a control region
because it is one of the best studied loci in the chromatin
literature. Mapping of the PHO5 promoter by indirect
end-labelling has established that the repressed promoter
is organized into an array of positioned nucleosomes
numbered -1 to -5 (38). There is a gap between nucleosomes
-2 and -3, where binding sites for the transcription
factors Pho4 and Pho2 are located. Induction disrupts this
ordered chromatin structure and increases accessibility of
the promoter DNA (38). In our experiments, cells were
grown under conditions such that PHO5 should be
repressed. 

The nucleosome occupancy profile is a plot of the 
chromosome base coordinate versus the number of nu- 
cleosome sequences that contain that particular base. It 
is therefore a measure of the probability of a base being 
contained within a nucleosome. Occupancy profiles for the 
PHO5 promoter in control and 3AT-treated cells are
shown (Figure 2A); all aligned nucleosome sequences 
were included. The data were not subjected to mathemat- 
ical manipulation, except that the 3AT data were 
multiplied by 1.27 to compensate for the fact that fewer 
total sequences were obtained relative to the control. The 
agreement between the profiles for control and 
3AT-treated cells was excellent; the traces superimposed 
in places and showed limited quantitative variation. 
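
An occupancy profile of this kind can be computed as sketched below. The sketch assumes fragments given as (start, end) pairs, 1-based and inclusive, on one chromosome; the function names are hypothetical. The scaling step uses the ratio of total aligned read pairs, which for the first experiment (16.6 million control, 13.1 million 3AT) is consistent with the factor of 1.27 quoted above.

```python
def occupancy_profile(fragments, region_start, region_end):
    """Number of fragments covering each base of [region_start, region_end]."""
    size = region_end - region_start + 1
    diff = [0] * (size + 1)               # difference array: +1 at start, -1 after end
    for start, end in fragments:
        s = max(start, region_start)      # clip the fragment to the region
        e = min(end, region_end)
        if s > e:
            continue                      # fragment lies outside the region
        diff[s - region_start] += 1
        diff[e - region_start + 1] -= 1
    profile, running = [], 0
    for step in diff[:size]:
        running += step                   # running sum gives per-base coverage
        profile.append(running)
    return profile


def scale_to_control(profile, n_control, n_treated):
    """Rescale a treated-sample profile to the control sequencing depth."""
    factor = n_control / n_treated
    return [value * factor for value in profile]


print(occupancy_profile([(1, 5), (3, 8)], 1, 10))
print(round(16.6 / 13.1, 2))
```

The difference-array form keeps the cost proportional to the number of fragments plus the region length, rather than to their product.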

The occupancy profiles for the PHO5 promoter
exhibited peaks corresponding to the five reported nucleosomes
(Figure 2A). Importantly, although the peaks were
quite obvious, the troughs between them did not dip close
to the baseline, indicating that many nucleosome
sequences included what should be linker DNA between
the reported nucleosomes. To assess whether this was due
to poorly trimmed nucleosomes (i.e. from nucleosomes
significantly >150 bp and therefore including some linker
DNA), the plot was restricted to data for nucleosome
sequences 145-155 bp in length (Figure 2B), corresponding
to 50% of all nucleosomes. The occupancy profiles
were marginally sharper (Figure 2B), but not essentially
different from the profiles corresponding to all nucleosomes
(Figure 2A). This suggests that the excluded sequences
(those >155 bp and <145 bp) are derived from the
same nucleosomes as those in the restricted data set (145-155 bp).

Figure 2. Nucleosome position clusters on the PHO5 promoter, TRP1 ARS1 and GAL1-GAL10. Chromatin structures of well-studied genes.
Control cells: red trace; 3AT-treated cells: green trace. Grey lines indicate 250-bp intervals. (A) Nucleosome occupancy of the PHO5 promoter
and upstream PBY1 gene. All sequences are included. Nucleosomes are indicated by ovals drawn to scale and numbered (38). (B) Nucleosome
occupancy as in A, except that only reads of 145-155 bp were included. Nucleosome positioning analysis for (C) control and (D) 3AT-treated cells.
The PHO5 coding region was omitted because it is highly homologous to other yeast genes (PHO3, PHO12 and DIA3), resulting in removal of some
sequences because their origin is uncertain. (E) TRP1 ARS1. Occupancy profiles for control and 3AT-treated cells (all sequences included) and
position analysis for control cells. Nucleosome ovals are drawn to scale and numbered (39). Our strain is trp1-1, which corresponds to a nonsense
mutation (asterisk) covered by the second nucleosome peak on TRP1, which is much reduced relative to the others, because sequences containing
this mutation are rejected as they do not match the wild-type sequence. (F) GAL1-GAL10. Occupancy profiles for control and 3AT-treated cells (all
sequences included) and position analysis for control cells. Black boxes: Gal4-binding sites. Grey boxes: Reb1 sites involved in expression of a
ncRNA beginning within GAL10 (40). The nucleosome oval is drawn to scale.

The accurate nucleosome positions are those defined by
sequences between 145 and 150 bp, corresponding to core
particles (comprising ~30% of nucleosome sequences from
control and 3AT-treated cells). These positions were
defined by their midpoints (dyad axes). To include as
much data as possible, the midpoints of all remaining
sequences (those <145 bp or >150 bp) were calculated
and then assigned to the accurate position with the
closest midpoint. These simple rules yield a position map 
in which the number of sequences corresponding to the 
midpoint of each specific nucleosome position is plotted 
against the chromosomal coordinate of the dyad axis. 

The position map for the PHO5 promoter showed that
some positions were strongly favoured, particularly those
corresponding to nucleosome -3 and the PBY1 coding
region (Figure 2C, D). However, in all cases, there were 
some less prominent midpoints near each major midpoint, 
corresponding to positions which overlap the major 
position to different extents. Thus, each dominant 
midpoint was associated with a cluster of midpoints, 
which we term a 'position cluster'. Each position cluster 
specifies a set of overlapping positions, which are mutually 
exclusive because canonical nucleosomes cannot physical- 
ly overlap on the same DNA molecule. Therefore, in some 
cells the nucleosome is present at the dominant position in 
each cluster while in other cells, the nucleosome is at 
one of the alternative positions. All of the PHO5
promoter nucleosomes corresponded to clusters including
a dominant position, except nucleosome -4, which was
a cluster of alternative positions with similar probabilities.
PBY1 also exhibited position clusters with a particularly
dominant position for the first nucleosome. In conclusion,
the chromatin structure of the PHO5 promoter is best
described in terms of nucleosome position clusters, 
rather than uniquely positioned nucleosomes. 

Position clusters on TRP1 ARS1 and at the
GAL1-GAL10 locus

Immediately downstream of TRP1 is ARS1, a well-studied 
replication origin. Examination of the chromatin structure 
of a TRP1 ARS1 plasmid by indirect end-labelling has 
revealed a hypersensitive site at the ARS consensus 
sequence (ACS), where the origin recognition complex 
binds, and three well-positioned nucleosomes (39). Our 
occupancy profile for TRP1 ARS1 showed the expected
nucleosome-depleted region at the ACS and three clear
nucleosome peaks downstream (Figure 2E). Once again,
the agreement between the profiles for control and
3AT-treated cells was excellent. As for the PHO5
promoter, analysis of nucleosome positioning on TRP1 
and ARS1 indicated the presence of position clusters, 
rather than unique positions (Figure 2E). 



The GAL1 and GAL10 genes are transcribed from a 
divergent promoter and should be repressed under our 
growth conditions. A quite regular series of nucleosome 
occupancy peaks was observed across both genes, which 
corresponded to a series of position clusters (Figure 2F). 
There were nucleosome-depleted regions in the divergent
promoter (corresponding to Gal4-binding sites), within
the GAL10 coding region [corresponding to Reb1 sites
required for activation of a ncRNA gene (40)] and at the
FUR4 promoter.

In conclusion, nucleosome position clusters were 
detected at all three regions examined (Figure 2), 
indicating that these genes can exist in any of several 
alternative chromatin structures. 

Altered position clusters on HIS3 in 3AT-treated cells 

Previously, we mapped nucleosome positions on HIS3 at 
high resolution using the monomer extension technique 
(16). In the absence of the Gcn4 activator, HIS3 is
organized into a dominant array of five nucleosomes, 
D1-D5, with a background of alternative, overlapping 
positions. Activation by Gcn4 results in increased occu- 
pancy of the alternative positions, and the D-positions are 
no longer dominant. This study (16) provides positioning 
data of sufficiently high resolution for direct comparison 
with the current study. 

Both HIS3 and the neighbouring PET56 gene are 
induced by 3AT (30). The occupancy profiles indicated
that HIS3 is flanked by nucleosome-depleted regions, corresponding
to the HIS3-PET56 and DED1 promoters
(Figure 3A). The profile for control cells indicated five
nucleosome peaks, although the separation between the 
third and fourth peaks was relatively indistinct and their 
occupancies were lower. The profile for 3AT-treated cells 
was somewhat different: the distinction between the third 
and fourth nucleosome peaks was even less clear, the fifth 
nucleosome peak was shifted a little upstream, and the 
overall occupancy was lower. The effect of 3AT on 
PET56 was more subtle, with slightly reduced occupancy 
at both ends of the coding region, but no obvious change 
in the fairly regular set of nucleosome peaks. 

Five position clusters were present on HIS3 in control 
cells (Figure 3C). The midpoints of the most prominent 
peaks were +15, +179 (with slightly weaker peaks at +156 
and +203), +327, +491 and +642/+672. These midpoints 
predict an array with a range of linker lengths and an 
average spacing of 164 bp (typical of yeast). They corres- 
pond reasonably well to the five dominant positions 
mapped previously (16): +8, +163, +327, +527 and 
+683, except for D4 that was mapped at +527 rather 
than at +491. In the paired-end data, the strongest peak 
in the cluster was at +491, but there was a smaller peak at 
+516. 
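
The average spacing quoted above follows directly from the dominant midpoints. The sketch below assumes that +672 is taken as the dominant position of the fifth cluster (the text lists +642/+672) and uses 147 bp as the core particle length; the function name is hypothetical.

```python
CORE_BP = 147  # length of DNA in a fully trimmed core particle


def spacings(midpoints):
    """Distances between successive dyad positions along the gene."""
    ordered = sorted(midpoints)
    return [b - a for a, b in zip(ordered, ordered[1:])]


# Most prominent HIS3 midpoints in control cells, relative to the start codon.
control_midpoints = [15, 179, 327, 491, 672]

gaps = spacings(control_midpoints)
mean_spacing = sum(gaps) / len(gaps)
linkers = [gap - CORE_BP for gap in gaps]  # implied linker lengths

print(gaps, mean_spacing, linkers)
```

The individual spacings differ considerably even though their mean is close to 164 bp, which is the 'range of linker lengths' referred to above.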

There were significant changes in the position clusters
on HIS3 in 3AT-treated cells (Figure 3D). In the first
cluster, the D1 position remained the most probable but
its dominance was reduced relative to an overlapping
position ~20 bp downstream. The same was true of the
fifth cluster in which the dominance of the +672 position
was reduced relative to that at +642. The major changes
occurred in the D2, D3 and D4 clusters: the dominant
position in the D2 cluster was shifted downstream to the
+203 position; the D3 and D4 clusters were replaced by a
very weak cluster centred on +425. Thus, the 3AT-induced
HIS3 gene had only four nucleosomes on average, rather
than the array of five in control cells. Since the positions
adopted by the nucleosome at each end of the array
(D1 and D5) did not change significantly, the average
spacing of the nucleosomes was much greater on the
3AT-induced gene (218 bp) than in control cells (164 bp).

Figure 3. Altered nucleosome position clusters on HIS3 in response to
3AT. (A) Nucleosome occupancy on HIS3 and flanking sequences.
Control cells: red trace; 3AT-treated cells: green trace. All sequences
are included. Nucleosomes are indicated by ovals drawn to scale. Grey
lines indicate 250-bp intervals. (B) Nucleosome occupancy using only
reads of 145-155 bp. Positioning analysis for (C) control cells and
(D) 3AT-treated cells. D1-D5: dominant positions adopted by nucleosomes
in cells lacking the Gcn4 activator (16). The asterisk indicates the
new position cluster formed in place of D3 and D4 in 3AT-treated cells.

In conclusion, the paired-end data are in good agree- 
ment with our previous monomer extension studies, 
providing some validation. Induction with 3AT reduced 
the occupancies of the dominant positions relative to the 
alternative positions and resulted in removal of one nu- 
cleosome and some re-positioning of the remaining 
nucleosomes. 

Activation disrupts the chromatin structures of ARG1 
and the neighbouring YOL057W gene 

ARG1, another Gcn4-dependent gene, is strongly induced
by 3AT (30). In control cells, ARG1 was organized into
nine nucleosome peaks, flanked by nucleosome-depleted
regions corresponding to the ARG1 and YOL057W promoters
(Figure 4A). The 5' half of YOL057W was also
well organized, displaying six nucleosome peaks before
becoming more irregular. In 3AT-treated cells, there was
a massive loss of occupancy across the entire ARG1 gene,
extending into the 3'-flanking gene (YOL057W)
(Figure 4A), even though its expression is not affected
by 3AT (30). Furthermore, the regular nucleosome
peaks observed on ARG1 in control cells merged into
one another in 3AT-treated cells. The position clusters
present on ARG1 and YOL057W in control cells were
heavily disrupted in 3AT-treated cells (Figure 4B, C).
Thus, ARG1 induction was associated with loss of more 
than half of its canonical nucleosomes; those remaining 
were no longer organized into clusters with dominant 
positions. Furthermore, these effects were propagated 
downstream into YOL057W. 

More genes exhibiting 3AT-induced disruption of position 
clusters 

To find other genes displaying similarly dramatic, 
3AT-induced effects on chromatin structure, the 
numbers of nucleosome sequences per coding region in 
control and 3AT-treated cells were compared using a 
whole-genome survey. This analysis ranked all genes 
using a 'disruption score', corresponding to the ratio of 
nucleosome sequences in 3AT-treated cells to sequences in 
control cells (after adjustment for the difference in the 
total number of nucleosome sequences obtained for 
control and 3AT-treated cells). A disruption score of <1 
indicates that a gene has fewer nucleosome sequences in 
3AT-treated cells, like ARG1. A cut-off score of 0.75 was 
set, requiring that a gene has >25% fewer sequences in 
3AT-treated cells. Forty-nine genes, including ARG1, had 
an average disruption score of <0.75 (Table 1). 
In addition, 13 genes showed the reverse effect, with an
equivalent cut-off score of >1.32 (25% fewer nucleosome
sequences in control cells than in 3AT-treated cells).

Figure 4. Induction of ARG1 results in reduced occupancy on the
coding region, extending into YOL057W downstream. ARG1 encodes
an enzyme involved in arginine biosynthesis. (A) Nucleosome occupancy
of the ARG1 gene and flanking sequences. Control cells: red
trace; 3AT-treated cells: green trace. All nucleosome sequences are
included. The oval indicates the size of a nucleosome. Grey lines
indicate 250-bp intervals. Nucleosome positioning analysis for control
cells (B) and 3AT-treated cells (C).
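
The disruption score used in the whole-genome survey can be sketched as a depth-normalized ratio. The sketch assumes that the adjustment for sequencing depth is a simple division by the total number of sequences in each sample, and it takes the upper cut-off as the reciprocal of 0.75 (about 1.33); both choices, and the function names, are illustrative assumptions rather than a description of the Supplementary scripts.

```python
def disruption_score(reads_treated, reads_control, total_treated, total_control):
    """Ratio of 3AT-treated to control sequences on one coding region,
    after normalizing each count by its sample's total sequence count."""
    if reads_control == 0:
        raise ValueError("no control sequences on this coding region")
    return (reads_treated / total_treated) / (reads_control / total_control)


def classify(score, low=0.75, high=None):
    """Apply the cut-offs; `high` defaults to the reciprocal of `low`."""
    if high is None:
        high = 1 / low
    if score < low:
        return "disrupted in 3AT-treated cells"
    if score > high:
        return "disrupted in control cells"
    return "below cut-off"


# Hypothetical gene: half as many sequences after 3AT treatment,
# with the total counts of the first experiment.
score = disruption_score(500, 1000, 13.1e6, 16.6e6)
print(round(score, 3), classify(score))
```

Because the 3AT sample has fewer sequences overall, a raw ratio of 0.5 corresponds to a normalized score of about 0.63, which still falls below the 0.75 cut-off.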

The expression microarray study (30) found 305 genes 
that are induced >2-fold by 10 mM 3AT and 104 genes 
that are repressed >2-fold. Of the 49 genes with disruption
scores equal to or <0.75, 29 were induced >2-fold [(30),
Table 1]. If the three genes for which there are no data are
excluded, 63% of genes with heavily disrupted chromatin
are induced by 3AT. In addition, four of the genes unaffected
by 3AT are located next to genes that are induced
by 3AT (ERV2/YPR036W-A, YSC83/ARG4, YIR035C/
LYS1 and COX9/IDP1). Of the 13 genes with heavily disrupted
chromatin in control cells, three are repressed by
3AT (33% of genes for which data are available; Table 1).
Genome-wide, there was a good correlation between the
disruption score and the fold induction by 3AT
(Supplementary Figure S1). A small fraction of genes
which were strongly induced or repressed by 3AT 
showed only weak chromatin disruption. This could be 
because the fold-change in expression of these genes is 
high but the absolute level of transcription is not very 
high (see 'Discussion' section). It should also be noted 
that the 3AT expression data are for a different yeast 
strain grown in a different medium (30) and so some 
genes might be affected differently by 3AT.

By comparing changes in expression in wild-type and 
gcn4Δ cells, Natarajan et al. (30) reported a list of 539
Gcn4 target genes. Of the 46 genes with disrupted chromatin
structure in 3AT-treated cells and for which there
are expression data, 38 are Gcn4 targets (83%). Only eight 
genes are not Gcn4 targets and three of these are neigh- 
bours of affected genes (Table 1). As expected, none of the 
genes with disrupted chromatin in control cells are Gcn4 
targets. Thus, most of the genes identified by the disrup- 
tion survey are known Gcn4 targets and are induced 
by 3AT. 
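
The percentages quoted for the 49 disrupted genes can be reproduced from the counts given in the text, as in the brief check below. The helper name is hypothetical; all numbers are those stated above.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


genes_with_data = 49 - 3       # three genes have no expression data
induced = 29                   # induced >2-fold by 3AT
gcn4_targets = 38              # listed as Gcn4 targets

print(round(percent(induced, genes_with_data)))       # fraction induced by 3AT
print(round(percent(gcn4_targets, genes_with_data)))  # fraction that are Gcn4 targets
print(genes_with_data - gcn4_targets)                 # genes that are not Gcn4 targets
```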

Induction with 3AT results in loss of canonical 
nucleosomes from coding regions and rearrangement of 
the remaining clusters, with little effect on 
nucleosome-depleted regions 

Occupancy profiles for some genes with heavily disrupted 
chromatin structures identified by the whole-genome 
survey (Table 1) are shown (Figure 5). In all cases, there 
was a dramatic loss of occupancy over the coding region 
in 3AT-treated cells. In control cells, the chromatin struc- 
tures of LYS1 (Figure 5D), the 5'- and 3'-ends of HIS4 
(Figure 5B) and the 5'-half of IDP1 (Figure 5E) were quite
regular, displaying well-defined nucleosome peaks, corres- 
ponding to position clusters with dominant positions. 
The chromatin structures of ARG4 (Figure 5C), ICY2 
(Figure 5A), the central region of HIS4 and the 3' half 
of IDP1 were less regular, indicating a more complex 
position cluster organization. The occupancy profiles of 
these genes in 3AT-treated cells were very different from 
those of control cells, indicating that the remaining nu- 
cleosomes had been rearranged. In the case of HIS4, all 
regularity was lost, indicating the absence of dominant 
positions (Figure 5B). The occupancy profile of LYS1 
remained quite regular in 3AT-treated cells, but there 
were only six clear peaks, which were out of phase with 
the peaks in control cells, with the exception of the sixth 
peak (Figure 5D). This indicated a change in the average 
positions and spacing of the nucleosomes, as observed for 
HIS3 (Figure 3). 

In striking contrast to the effects of 3AT on nucleosome 
occupancy of the coding regions, there was little effect on 
occupancy at the promoters and 3'-ends of these genes, 
which were all significantly depleted of nucleosomes in 
both control and 3AT-treated cells. 

Extension of chromatin disruption into flanking genes 

The disruption of ARG1 chromatin in 3AT-treated cells 
extended far into the gene downstream (Figure 4A). This 
was also true for HIS4 and ARG4 (Figure 5). None of 
these downstream genes are Gcn4 targets and all are 
unaffected by 3AT (30). Indeed, nucleosome occupancy 
on YSC83, downstream of ARG4, was reduced so 
strongly that it was scored as a gene with heavily disrupted 
chromatin structure (Table 1). In all three cases, occupancy 
of the downstream gene at the end farthest from 
the target gene was similar to that in control cells, revealing 
that the disruptive effect diminished with distance 
from the target gene. 

Table 1. Genes with disrupted chromatin structure in 3AT-treated or control cells 

Table 1 columns: Rank (a); Gene; Disruption score, average (b); Disruption score, SD (b). 

a A total of 5541 genes were ranked according to their average disruption scores (tRNA genes and some very short ORFs <170 bp were eliminated 
from the ranking, because they are too short). 

b Disruption score = no. of sequence reads in 3AT-treated cells / no. of reads in control cells (normalized for the difference in the total number of 
reads). Average disruption scores for two independent experiments are given with standard deviation (SD) in the next column. Genes with average 
scores <0.75 (25% fewer reads in 3AT-treated cells; 49 genes), or >1.33 (25% more reads in 3AT-treated cells; 13 genes) are shown. Also included 
are the values for other genes described in the text. 

c The average disruption score is expressed as a natural log to facilitate comparison of the extent of disruption in 3AT-treated and control genes. The 
cut-off scores are -0.28 for genes with disrupted chromatin in 3AT-treated cells and +0.28 for genes with disrupted chromatin in control cells, 
interpolated from expression microarray data (30). ND: no data reported for this gene. Although most genes with disruption scores equal to or 
<0.75 are induced by 3AT, two genes (CIT2 and DIP5) are repressed by 3AT; the explanation for this is unknown. 

e Gcn4 target gene in a set of 539 genes (30). 
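
The disruption score defined in the footnotes to Table 1 can be illustrated with a short calculation. The following is a minimal sketch, not the code used in this study: the function names, the example read counts and the use of the sample standard deviation are assumptions made for illustration; only the ratio definition, the depth normalization and the 0.75 and 1.33 cut-offs come from the table footnotes.

```python
import math
from statistics import mean, stdev


def disruption_score(reads_3at, reads_control, total_3at, total_control):
    """Reads on a gene in 3AT-treated cells divided by reads in control cells,
    each first divided by the total number of reads in its own data set."""
    if reads_control <= 0 or total_3at <= 0 or total_control <= 0:
        raise ValueError("read counts and totals must be positive")
    return (reads_3at / total_3at) / (reads_control / total_control)


def summarize(scores):
    """Average, SD and natural log of the average for replicate scores.
    Assumption: sample SD (n - 1 denominator); the source does not say which SD was used."""
    avg = mean(scores)
    sd = stdev(scores) if len(scores) > 1 else 0.0
    return avg, sd, math.log(avg)


def classify(avg, low=0.75, high=1.33):
    """Apply the cut-offs quoted in the Table 1 footnotes."""
    if avg < low:
        return "disrupted in 3AT-treated cells"
    if avg > high:
        return "disrupted in control cells"
    return "not scored as disrupted"
```

For a hypothetical gene with 300 reads out of 2 million in 3AT-treated cells and 1000 reads out of 4 million in control cells, `disruption_score(300, 1000, 2_000_000, 4_000_000)` gives a score of about 0.6, which `classify` would place among the genes disrupted in 3AT-treated cells.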

The chromatin structures of the genes downstream of 
ICY2, LYS1 and IDP1 were unaffected (Figure 5). 
However, in these cases, the chromatin structure of the 
upstream gene was disrupted. Most strikingly, YIR035C, 
upstream of LYS1 (Figure 5D), was heavily disrupted, 
even though it is not a Gcn4 target gene and is unaffected 
by 3AT (30). 

In summary, the chromatin structures of 49 genes were 
heavily disrupted in 3AT-treated cells: the entire coding 
region was heavily depleted of canonical nucleosomes. In 
some cases, this disruption extended to flanking genes, 
either upstream or downstream. Nucleosome positioning 
was heavily disrupted with major reductions in 
occupancies of dominant positions observed in control 
cells. At HIS3 and LYS1, the average number of nucleo- 
somes on the gene was reduced, implying changes in nu- 
cleosome spacing. 

The chromatin of genes repressed by 3AT is more ordered 
in 3AT-treated cells 

There were 13 genes with extreme disruption scores in the 
opposite sense: more nucleosome sequences were obtained 
from 3AT-treated cells than from control cells (Table 1). 
Three of these genes (URA1, OLE1 and MOG1) are 
repressed by 3AT (30). MOG1 had a more ordered chro- 
matin structure in 3AT-treated cells than in control cells 
(Figure 6A), with four nucleosome peaks corresponding to 
position clusters (Figure 6A). In control cells, the first 
nucleosome peak on MOG1 and the corresponding 
position cluster were greatly diminished. MOG1 shares a 
divergent promoter with OPI3, the coding region of which 



was somewhat depleted of nucleosomes (Table 1), but it is 
unclear whether this effect was communicated from 
MOG1, because OPI3 is also repressed by 3AT (30). The 
gene with the most disrupted chromatin in control cells 
was URA1. There was a major loss of occupancy over the 
coding region in control cells relative to 3AT-treated 
cells (Figure 6B). All nine position clusters located 
between the nucleosome-depleted regions flanking URA1 
in 3AT-treated cells were disrupted in control cells 
(Figure 6B). In conclusion, MOG1 and URA1 are re- 
pressed by 3AT (30) and their chromatin structures 
showed the opposite transition from 3AT-induced genes. 

DISCUSSION 

Advantage of paired-end sequencing for mapping 
nucleosome positions 

Nucleosome length distributions indicate that each sample 
contains fully trimmed nucleosome core particles, together 
with some incompletely trimmed nucleosomes and 
damaged core particles. This was expected because of 
MNase digestion kinetics, the possible influence of H1, 
and slower trimming of the final ~20 bp of linker DNA. 
DNA length is essential information for determining 
accurate nucleosome positions. If the DNA is significantly 
>150 bp, there is uncertainty in the position of the nucleosome, 
because it occupies only 145-150 bp. If the DNA is 
significantly <145 bp, the position is also unclear, because 
the nucleosome from which it is derived must have been 
cleaved internally or trimmed excessively from one or both 
ends. Sequences which are too long or too short can be 
selectively excluded from paired-end data, but not from 
single-end data. 
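
The length-based selection described above can be sketched as follows. This is an illustrative example only: the 145-150 bp window is taken from the size range quoted in the text, but the exact thresholds applied in the analysis are not stated in this section, and the function names and coordinate convention (1-based, inclusive) are assumptions.

```python
def fragment_length(start, end):
    """Length of a paired-end fragment from the 1-based inclusive
    coordinates of its two ends (assumed convention)."""
    return end - start + 1


def select_core_particles(fragments, min_len=145, max_len=150):
    """Keep only fragments whose length is consistent with a fully trimmed
    core particle. The default window is an assumption for illustration."""
    return [
        (start, end)
        for start, end in fragments
        if min_len <= fragment_length(start, end) <= max_len
    ]


def midpoint(start, end):
    """Approximate dyad position of a nucleosome: the fragment midpoint."""
    return (start + end) / 2
```

With single-end data only one coordinate of each fragment is known, so `fragment_length` cannot be evaluated and this filter cannot be applied; that is the advantage of paired-end reads argued above.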

In our analysis, we consider only DNA fragments of 
approximately mono-nucleosome size (any protected 
DNA fragments much larger or much smaller than the 
nucleosome are not present in our data sets, because the 
DNA was gel purified). Thus, we are considering only 
canonical nucleosomes, which are really defined by their 
ability to protect ~147 bp. We expect that there might be a 
small fraction of ~147-bp sequences scored as canonical 
nucleosomes that are not canonical nucleosomes. Some of 
these sequences might correspond to internal cleavage 
sites in neighbouring nucleosomes. If so, such sequences 
would have to contain an intact linker, which is unlikely 
and cannot be quantitatively very significant because such 
cleavages would smear the nucleosomal repeat pattern. 
Another possible problem is that a transcription factor 
bound adjacent to a nucleosome might protect some 
linker DNA after the very extensive digestion used to 
make core particles, but we are not aware of any studies 
indicating that transcription factors offer strong protection 
against MNase digestion. It is worth noting that transcription 
factors bind reversibly to DNA (unlike histones 
in the nucleosome) and so would be expected to offer less 
protection. In addition, even histone H1, which binds 
tightly to the nucleosome, offers only transient protection 
from MNase under these conditions. Overall, we believe 
that the vast majority of ~147-bp sequences are indeed 
canonical nucleosomes. 

Figure 5. Nucleosome occupancy profiles of 3AT-induced genes with severely disrupted chromatin structure. Occupancy profiles of genes with the 
most disrupted chromatin structure in 3AT-treated cells (Table 1). Control cells: red trace; 3AT-treated cells: green trace. All nucleosome sequences 
are included. Nucleosomes are indicated by ovals drawn to scale. Grey lines indicate 250-bp intervals. (A) ICY2 (ranked second). (B) HIS4 (ranked 
fourth). (C) ARG4 (fifth) and YSC83 (ninth). The disorganized central region of ARG4 most likely reflects the presence of an origin of replication 
(ARS2) within the coding region. (D) LYS1 (sixth) and YIR035C (49th). (E) IDP1 (10th). All of these genes are induced by 3AT and are Gcn4 
targets (30) (Table 1). All have canonical Gcn4-binding sites in their promoters except ICY2, which has two non-canonical sites. 

Figure 6. Repression of MOG1 and URA1 by 3AT results in 
re-ordering of disrupted chromatin structure. Occupancy profiles and 
position cluster analysis for two genes with disrupted chromatin structure 
in control cells (Table 1) that are repressed by 3AT (30) (Table 1). 
Control cells: red trace; 3AT-treated cells: green trace. All sequences are 
included. Nucleosomes are indicated by ovals drawn to scale. Grey 
lines indicate 250-bp intervals. (A) Nucleosome occupancy and positioning 
on MOG1. (B) Nucleosome occupancy and positioning on 
URA1. 

There is potential for bias in genome-wide sequencing 
studies, particularly because two different DNA amplifi- 
cation steps are involved. We do not believe that bias is a 
major problem in our study because: (i) we have validated 
our current mapping data by comparison with some 
famous examples in the classical literature (Figures 2 
and 3); (ii) the average nucleosome occupancy is very con- 
sistent across the genome; and (iii) our data are very 
reproducible. 

Biological implications of position clusters 

The chromatin structures reported here for the PHO5 
promoter and TRP1 ARS1 (Figure 2) are consistent 
with previous studies using low-resolution indirect 
end-labelling (38,39). However, the higher resolution 
provided by paired-end sequencing reveals that each pos- 
itioned nucleosome reported by indirect end-labelling is in 
fact an average of several overlapping positions (a 
position cluster). We and others have described similarly 
complex chromatin structures previously (11,16,23,31,39), 
but it has been generally assumed that these are atypical. 
More recently, complex chromatin structures have been 
noted genome wide in Caenorhabditis elegans (27). The 
present study demonstrates that complex chromatin struc- 
tures are the rule in yeast chromatin, not the exception. 

We define a position cluster as a set of overlapping pos- 
itions, usually including a dominant position (Figure 7A). 
These must be alternative positions, because canonical nu- 
cleosomes cannot physically occupy the same DNA. In 
this context, it is worth noting that in vitro, a nucleosome 
can invade the territory of a neighbouring nucleosome, 
resulting in the loss of one H2A-H2B dimer and 
forming a particle that protects ~250 bp from MNase digestion 
(42). If such coalesced nucleosomes are present in 
yeast, they would not appear in our maps because they 
protect much more than 147 bp. 

In a particular cell at a given moment, the nucleosome 
represented by a position cluster occupies one of the pos- 
itions within the cluster. Thus, in some cells, the nucleo- 
some will occupy the dominant position; in other cells, it 
will be at one of the alternative positions. This observation 
has important biological implications. For example, many 
models proposed for the regulation of specific genes 
depend on precise positions adopted by nucleosomes at 
the promoter, with critical transcription factor-binding 
sites located in the linker DNA, or just inside the nucleo- 
some core, rather than in the inaccessible centre. Our data 
imply that factor-binding sites at nucleosomal promoters 
(e.g. PHO5) might be accessible in some cells, but not in 
others. It seems likely that remodelling machines will play 
critical roles here, because they are able to move nucleo- 
somes along the DNA, perhaps from one position in a 
cluster to another, perhaps exposing or obscuring 
specific factor-binding sites. Furthermore, there is poten- 
tial for stochastic effects, given that apparently identical 
cells can have different chromatin structures. 

Figure 7. Alternative arrays can account for position clusters. (A) A 
position cluster with a central dominant position and four alternative 
positions. Peaks indicate nucleosome dyad positions. (B) Occupancy 
profile and position midpoints for an array of five perfectly positioned 
nucleosomes. Nucleosome: 145 bp; linker: 20 bp. Smoothed with a 
25-bp moving average. (C) Occupancy profile and position midpoints 
for five alternative arrays of five nucleosomes: a dominant array (dark 
grey ovals; relative occupancy = 1); two arrays shifted by 20 bp 
upstream and downstream of the dominant array (light grey ovals; 
relative occupancy = 0.5); two arrays shifted by 40 bp upstream and 
downstream of the dominant array (white ovals; relative 
occupancy = 0.2). (D) Occupancy profile and position midpoints for two 
arrays of equal occupancy but different spacing, beginning and 
ending with the same nucleosome: upper array: five nucleosomes with 
20-bp linker (165-bp repeat); lower array: four nucleosomes with 75-bp 
linker (220-bp repeat). 

Position clusters and nucleosome spacing 

Although the existence of position clusters indicates that 
chromatin structure is more complex than has been generally 
acknowledged, a significant simplifying factor is 
that nucleosomes in yeast are regularly spaced with an 
average linker length of 15-20 bp (a 160- to 165-bp 
repeat). Consequently, we propose that each position 
cluster corresponds to positions belonging to alternative 
arrays with the same spacing (17). An array of perfectly 
positioned nucleosomes predicts a 'square-wave' occupancy 
profile (Figure 7B), which is not generally 
observed. The only obvious example of a square nucleosome 
occupancy peak that we have found in our data is 
the single nucleosome located over each centromere, 
which is therefore perfectly positioned, but this nucleosome 
is unusual in that it contains CenH3 (Cse4), 
a variant of H3 (33). 

A set of five overlapping arrays with the same spacing 
(165 bp) predicts a profile similar to the more regular 
profiles and position clusters observed experimentally 
(Figure 7C); this example is just one of many possibilities. 
Most arrays must have the same spacing to yield the 
observed bulk chromatin repeat of 165 bp, but quantitatively 
rare arrays could have quite different spacing. 
An interesting example is the square wave interference 
pattern generated in the case where half the cells have 
an array of five nucleosomes on gene X (165-bp spacing) 
and the other half have an array of four nucleosomes 
(220-bp spacing), beginning and ending with the same 
nucleosome (Figure 7D): both outermost nucleosomes 
give rise to a clear nucleosome peak in the occupancy 
profile, but the inner nucleosomes contribute an irregular 
pattern, including sharp spikes. Counter-intuitively, a 
well-positioned nucleosome is located below the central 
trough in the occupancy profile (Figure 7D). Thus, both 
regular and irregular occupancy profiles could be 
accounted for by overlapping regular arrays. 

If a spacing factor begins at one nucleosome-depleted 
region and terminates at the next, then nucleosomes on a 
gene might be subjected to spacing from both ends, resulting 
in at least two alternative arrays. Little is known about 
how nucleosomes are spaced in yeast in vivo. In vitro, the 
yeast ISW1 and INO80 complexes can create arrays with 
~175-bp spacing, and ISW2 can assemble arrays with 
~200-bp spacing (43,44). In Drosophila, there are two 
well-characterized nucleosome spacing factors, ACF and 
CHD1 (45,46). How the activities of spacing factors 
interact in terms of array formation in vivo is an important 
question. 
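
The alternative-array model of Figure 7 can be explored numerically. The sketch below is an assumption-laden illustration rather than the procedure used to draw the figure: the function names, the 1000-bp window and the start coordinate of 100 are hypothetical, while the nucleosome size (145 bp), repeat lengths (165 and 220 bp), array shifts, relative occupancies and the 25-bp moving average are taken from the Figure 7 legend.

```python
def array_occupancy(length, first_start, n_nucleosomes, repeat, weight, size=145):
    """Occupancy contributed by one regular array: each nucleosome covers
    `size` bp and successive nucleosomes start `repeat` bp apart."""
    profile = [0.0] * length
    for i in range(n_nucleosomes):
        start = first_start + i * repeat
        for x in range(max(start, 0), min(start + size, length)):
            profile[x] += weight
    return profile


def summed_profile(length, arrays):
    """Sum the occupancy of several alternative arrays, each given as
    (first_start, n_nucleosomes, repeat, weight)."""
    total = [0.0] * length
    for first_start, n, repeat, weight in arrays:
        single = array_occupancy(length, first_start, n, repeat, weight)
        total = [a + b for a, b in zip(total, single)]
    return total


def moving_average(profile, window=25):
    """Centred moving average; the window is truncated at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


# Figure 7C: a dominant array plus arrays shifted by 20 and 40 bp either way.
ARRAYS_7C = [(100 + shift, 5, 165, w)
             for shift, w in [(-40, 0.2), (-20, 0.5), (0, 1.0), (20, 0.5), (40, 0.2)]]

# Figure 7D: two arrays of equal occupancy sharing their first and last nucleosome
# (4 x 165 bp = 3 x 220 bp = 660 bp between the first and last start).
ARRAYS_7D = [(100, 5, 165, 0.5), (100, 4, 220, 0.5)]

profile_7c = moving_average(summed_profile(1000, ARRAYS_7C))
profile_7d = summed_profile(1000, ARRAYS_7D)
```

In this sketch, the unsmoothed 7D profile is 1.0 over the shared outer nucleosomes but only 0.5 at the midpoint of the central nucleosome of the five-nucleosome array, which reproduces the counter-intuitive central trough described in the text.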

Low occupancy and disrupted position clusters on 
transcriptionally active genes 

The effect of 3AT on the chromatin structures of some 
induced genes is dramatic: nucleosome occupancy is 
heavily reduced over the entire coding region. Moreover, 
some 3AT-repressed genes show the opposite trend: occupancy 
increases to normal levels in 3AT-treated cells. 
Thus, reduced occupancy correlates with transcriptional 
activation. In addition, nucleosome spacing is altered 
after induction on at least two genes (HIS3 and LYS1). 
Altered nucleosome spacing might reflect an intermediate 
chromatin state corresponding to a level of disruption in 
between the resting state and a major loss of canonical 
nucleosomes. Only a subset of 3AT-induced genes show 
extreme loss of canonical nucleosomes. These are 
probably the most transcriptionally active genes, since 
single gene and microarray studies indicate that the 
extent of histone loss from coding regions correlates 
with heavy transcription (47-51). 

The mono-nucleosome sequencing approach identifies 
only canonical nucleosomes (i.e. those which protect 
~147 bp of DNA). Consequently, reduced occupancy on 
coding regions and at nucleosome-depleted regions could 
reflect actual loss of nucleosomes (resulting in free DNA), 
or the presence of 'non-canonical' nucleosomes which 
have been remodelled such that they no longer adequately 
protect their DNA from MNase. Thus, reduced occu- 
pancy on coding regions might reflect loss of the entire 
histone octamer, or of just one or both H2A-H2B 
dimers (48). Histone hexamers and H3-H4 tetramers 
protect less DNA than the octamer and DNA in this 
size range (~80 to ~120 bp) is not present in our preparations 
of core particle DNA. Alternatively, the histones 
might still be bound to the DNA, but present in 
remodelled nucleosomes, as we have suggested previously 
(16,52). 

Loss of canonical nucleosomes can extend into flanking 
genes 

In the cases of the genes most strongly affected by 3AT, 
loss of canonical nucleosomes occurs not just over the 
coding region, but extends into neighbouring genes 
(Figure 5). It seems unlikely that this is a direct effect of 
transcription, involving RNA polymerase II ploughing on 
into the chromatin of the downstream gene after release of 
the mRNA, because in some cases upstream genes are 
affected. Previously, we have observed disruption of nu- 
cleosome positioning on the flanking TRP1 gene in CUP1 
or HIS3 plasmid chromatin after induction (11,16). 
However, since CUP1 and HIS3 were not in their native 
chromosomal contexts and the TRP1 gene was also active, 
the biological significance of the disruption of flanking 
chromatin structure is unclear. More recently, in 
Drosophila, a single-gene nucleosome scanning study has 
shown that heat shock induces nucleosome loss over a pair 
of divergently transcribed Hsp70 genes, extending in both 
directions into the flanking sequences as far as the scs and 
scs' insulating elements (50). This effect does not depend 
on transcription, but on poly(ADP-ribose) polymerase 
(50). Since there is no evidence for this enzyme in yeast, 



the mechanism in yeast must be different, perhaps 
involving remodelling by SWI/SNF, as we have 
observed previously for HIS3 (16,52). We are currently 
investigating this possibility. 

In summary, our genome-wide nucleosome sequence 
data show not only that there is a major loss of canonical 
nucleosomes from the coding regions of some 
3AT-induced genes, but also that the positioning of the 
remaining nucleosomes is heavily disrupted. Thus, the 
chromatin structure of the coding region undergoes 
major remodelling on activation, with disruption of the 
dominant nucleosomal array and loss of canonical nucleo- 
somes. This disruptive effect can be communicated to 
flanking genes through nucleosome-depleted promoters 
and 3'-regions that are seemingly unaffected, indicating 
that they do not act as strict boundaries. The factors 
that direct the formation of these domains of altered chro- 
matin structure and determine their boundaries are cur- 
rently under investigation. 

We have submitted four sets of paired-end sequencing 
data to the GEO database; these correspond to two inde- 